Abstract: Data mining techniques are now widely used across industries. The Social Security Insurance Audit faces more than two million audit cases nationwide with limited manpower and cannot examine every case. This research uses data mining techniques to propose an efficient method for detecting high-risk companies and fraud in the insurance industry. Materials and Methods: A random sample of 2000 records was drawn from the 2014 Social Security Audit database, various features were extracted for each company, and the data was pre-processed. The data was then split into a training set and a test set and given to four algorithms: Neural Network, Decision Tree, Bayesian and Support Vector Machine. Finally, the accuracy of each algorithm was computed from its confusion matrix. Findings: A comparison of the four algorithms shows that all of them achieve an acceptable level of accuracy. The neural network, with 90.6% accuracy, was the best algorithm for identifying high-risk companies. These findings enable the Social Security Audit to develop software that identifies companies with high insurance risk, improving efficiency despite current limitations in manpower.

Keywords: Data Mining; Insurance Audit; Fraud; Neural Networks.
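
As an illustration of the evaluation step described in the abstract, the following minimal sketch shows how accuracy can be derived from a confusion matrix for a two-class problem. The label names (`high_risk`, `low_risk`), the helper functions and the sample labels are hypothetical and invented for illustration; they are not taken from the study's data or code.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive="high_risk"):
    """Count true/false positives and negatives for a binary labelling."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "fn", "tn")}


def accuracy(cm):
    """Accuracy = correctly classified cases / all cases."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


# Made-up test-set labels (assumption: NOT data from the study).
actual = ["high_risk", "high_risk", "low_risk", "low_risk",
          "low_risk", "high_risk", "low_risk", "low_risk"]
predicted = ["high_risk", "low_risk", "low_risk", "low_risk",
             "high_risk", "high_risk", "low_risk", "low_risk"]

cm = confusion_matrix(actual, predicted)
print(cm)            # {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 4}
print(accuracy(cm))  # 0.75
```

In an audit setting the individual cells may matter as much as the overall accuracy, since a missed high-risk company (a false negative) and an unnecessary audit (a false positive) carry different costs.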